The Annotated S4 (Korean)

Efficiently Modeling Long Sequences with Structured State Spaces

Albert Gu, Karan Goel, and Christopher Ré.

Blog post and library by Sasha Rush and Sidd Karamcheti, v3

The Structured State Space for Sequence Modeling (S4) architecture is a new approach to modeling very long sequences in vision, language, and audio, demonstrating the ability to capture dependencies spanning tens of thousands of steps. Especially impressive are its results on the Long Range Arena benchmark, where it reasons with high accuracy over sequences of 16,000+ elements.

The paper is also a refreshing departure from Transformers, taking a very different approach to an important problem space. However, several of our colleagues have noted privately the difficulty of gaining intuition for the model. This blog post is a first step toward that goal, linking concrete code implementations with explanations from the S4 paper – very much in the style of the annotated Transformer. Hopefully this combination of code and literate explanations helps you follow the details of the model. By the end of the blog you will have an efficient working version of S4 that can operate as a CNN for training, but then convert to an efficient RNN at test time. To preview the results, you will be able to generate images directly from pixels and sounds directly from audio waves on a standard GPU.
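To give a flavor of the CNN/RNN duality mentioned above, here is a minimal numpy sketch (not the paper's implementation, and without S4's structured parameterization): a discrete linear state space model can be stepped through time like an RNN, or unrolled into a convolution kernel `K = (CB, CAB, CA²B, ...)` and applied like a CNN. The matrices below are random placeholders chosen only to show that the two evaluation modes agree.

```python
import numpy as np

rng = np.random.default_rng(0)
N, L = 4, 8                       # state size, sequence length
A = rng.normal(size=(N, N)) * 0.3  # scaled down to keep the recurrence stable
B = rng.normal(size=(N, 1))
C = rng.normal(size=(1, N))
u = rng.normal(size=L)            # input sequence

# Recurrent (RNN-style) evaluation: x_k = A x_{k-1} + B u_k, y_k = C x_k
x = np.zeros((N, 1))
y_rnn = []
for k in range(L):
    x = A @ x + B * u[k]
    y_rnn.append((C @ x).item())

# Convolutional (CNN-style) evaluation with kernel K_i = C A^i B
K = np.array([(C @ np.linalg.matrix_power(A, i) @ B).item() for i in range(L)])
y_cnn = [sum(K[i] * u[k - i] for i in range(k + 1)) for k in range(L)]

assert np.allclose(y_rnn, y_cnn)  # both views give the same output
```

The rest of the post builds this duality out properly: train with the parallel convolutional view, then switch to the cheap recurrent view for autoregressive generation.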

Table of Contents